Filter bank subtraction for robust speech recognition
Abstract
In this paper, we propose a new technique, filter bank subtraction, for robust speech recognition under various acoustic conditions. Spectral subtraction is a simple and useful technique for reducing the influence of additive noise. Conventional spectral subtraction assumes that the noise spectrum is estimated accurately and that speech and noise are uncorrelated. Those assumptions, however, are rarely satisfied in practice, which degrades speech recognition accuracy. Moreover, the recognition improvement attained by conventional methods is slight when the input SNR changes sharply. We propose a new method in which the output values of filter banks are used for noise estimation and subtraction. By estimating noise at each filter bank, instead of at each frequency point, the method reduces the need for precise noise estimation. We also take into account phase differences between the spectra of speech and noise in the subtraction. Recognition experiments on test sets at several SNRs showed that the filter bank subtraction technique improved word accuracy significantly and outperformed conventional spectral subtraction on all the test sets. In further experiments on recognizing speech from TV news field reports with environmental noise, the proposed subtraction method also yielded better results than the conventional method.
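The idea of estimating and subtracting noise per filter bank channel, rather than per frequency point, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the triangular mel-spaced filters, the noise estimate taken from a few leading noise-only frames, the function names, and the parameters `alpha` (over-subtraction factor), `beta` (spectral floor) and `cos_theta` (an assumed average cosine of the speech-noise phase difference) are all hypothetical choices. The phase term uses the textbook relation |Y|^2 = |S|^2 + |N|^2 + 2|S||N|cos(theta), applied here to filter bank outputs as an approximation; with `cos_theta=0` it reduces to plain power subtraction.

```python
import numpy as np

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Triangular mel-spaced filters, shape (n_filters, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):       # rising edge
            bank[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):      # falling edge, peak of 1 at center
            bank[i, k] = (right - k) / max(right - center, 1)
    return bank

def frame_power_spectra(signal, n_fft=256, hop=128):
    """Short-time power spectra, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def filter_bank_subtract(fbank_energies, n_noise_frames=10,
                         alpha=1.0, beta=0.01, cos_theta=0.0):
    """Subtract a per-channel noise estimate from filter bank energies.

    Assumption (not from the paper): the first n_noise_frames frames
    contain noise only, and cos_theta is a fixed illustrative constant.
    """
    # One noise value per filter bank channel, not per frequency bin.
    noise = alpha * fbank_energies[:n_noise_frames].mean(axis=0)
    n_mag = np.sqrt(noise)
    # Solve |Y|^2 = |S|^2 + |N|^2 + 2|S||N|cos(theta) for |S|.
    disc = np.maximum(fbank_energies - noise * (1.0 - cos_theta ** 2), 0.0)
    s_mag = np.maximum(np.sqrt(disc) - n_mag * cos_theta, 0.0)
    # Floor the result so channels never drop below beta times the input.
    return np.maximum(s_mag ** 2, beta * fbank_energies)

# Usage on synthetic data: noise only, then a 1 kHz tone plus noise.
rng = np.random.default_rng(0)
sr = 8000
t = np.arange(sr) / sr
noisy = 0.1 * rng.standard_normal(sr)
noisy[sr // 2:] += np.sin(2 * np.pi * 1000.0 * t[sr // 2:])
energies = frame_power_spectra(noisy) @ mel_filter_bank(8, 256, sr).T
cleaned = filter_bank_subtract(energies)
```

Because each channel sums many frequency bins, its noise estimate averages out bin-level fluctuations, which is one plausible reading of why the abstract says the method reduces the need for precise noise estimation.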
Similar resources
Improving the performance of MFCC for Persian robust speech recognition
The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have demonstrated their performance in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN which has a bottleneck layer among its fully connected layers. The bottleneck fea...
Cepstrum derived from differentiated power spectrum for robust speech recognition
In this paper, cepstral features derived from the differential power spectrum (DPS) are proposed for improving the robustness of a speech recognizer in the presence of background noise. These robust features are computed from the speech signal of a given frame through the following four steps. First, the short-time power spectrum of the speech signal is computed through the fast ...
Wavelet Filter Bank Based Robust Speech Enhancement
L.M. Kadam, D.S. Aldar, and B.B. Godbole, K.B.P. College of Engineering and Polytechnic, Satara. The paper investigates a new speech enhancement scheme to meet the demand for quality noise reduction algorithms capable of operating at a very low signal-to-noise ratio....
A Low-Cost Robust Front-end for Embedded ASR System
In this paper, we propose a low-cost robust MFCC feature extraction algorithm which combines noise reduction and voice activity detection (VAD) for automatic speech recognition (ASR) systems in embedded applications. To remedy the effect of additive noise, a magnitude spectrum subtraction method is used. A VAD is performed to distinguish the speech signal from the noise signal. It discriminates speech/non...